We written a little wrapper function, wrapper_get_meta_data(). This wrapper gets all the Australian station data from the R package, RNOAA, which provides an interface with the ftp server that hosts GHCN-Daily.
as_meta_data = wrapper_get_meta_data()
save(as_meta_data, file = "Data/AS_meta_data.RData")
We have saved this meta data into our package for direct access. There are four precipitation elements. These can be referenced within the meta data using the element codes:
* PRCP is the daily precipitation (tenths of a mm)
* MDPR is a multiday accumulated total (tenths of a mm) * DAPR is the number of days in a multiday total
* DWPR number of days of non-zero rainfall in a multiday total.
load("Data/AS_meta_data.RData")
prcp_meta_data = dplyr::filter(as_meta_data, element == "PRCP")
A summary of some of the some features of the data is given graphically below.
Once we know the station ID we are intersted in, we can get the station data from GHCN-Daily using the RNOAA package.
stn_id = "ASN00040101"
prcp_var <- rnoaa::meteo_pull_monitors(stn_id,
date_min = "1910-01-01",
date_max = "2010-01-01",
keep_flags = TRUE,
var = "PRCP")
To save time, as getting data from the server can take a while, I chose to save the data. Here is an example of how my wrapper works, into which you can pass all arguements from the rnoaa::meteo_pull_monitors() function. I save the data for the set of stn_ids provides into a file of the form paste(data_dir, element_type, file_str, ".rds", sep =""). There is a different file for each precipitation element type. The file string is an option for referenced, with an example of a possible file_str being paste("_xll_", bbox$xll, "_yll_", bbox$yll, "_dx_", dx, "_dy_", dy, ".rds", sep ="").
data_dir = "my_data_dir/"
file_str = "my_fav_stns"
# Restrict the date range to these dates
date_min = "1910-01-01"
date_max = "2017-12-31"
# Get the data from near my hometown
stn_ids = as_meta_data %>%
dplyr::filter(longitude >= 152.6 & longitude < 152.8 &
latitude >= -27.7 & latitude < -27.5 &
element == "PRCP") %>%
dplyr::select(id) %>%
unlist() %>%
as.vector()
# Save out the data
wrapper_save_prcp_data(stn_ids = stn_ids, data_dir = data_dir, file_str = file_str,
date_min = date_min, date_max = date_max)
My application area is extremes, therefore it is important to address missingness within our precipitation data. Some studies use gridded observational products to interpolate missing values, however this can result in underestimating extremes. We have chosen to use the nearest neighbours.
We can get all the nearest neighbours within a given search radius using rnoaa::meteo_nearby_stations().
stn_id = "ASN00040101"
search_radius = 10 #(km)
stn_lat_lon = as_meta_data %>%
dplyr::filter(id == stn_id) %>%
select(id, latitude, longitude) %>%
distinct() %>%
as.data.frame()
nearby_meta_data <- rnoaa::meteo_nearby_stations(lat_lon_df = stn_lat_lon,
station_data = as_meta_data,
var = "PRCP",
year_min = 1910,
radius = search_radius)[[1]]
nbr_plot <- ggplot() +
geom_point(data = nearby_meta_data, aes(x = longitude, y = latitude), col ="blue") +
geom_point(data = stn_lat_lon, aes(x = longitude, y = latitude), col ="red", shape = 3)
ggplotly(nbr_plot)
prcp_var <- rnoaa::meteo_pull_monitors(monitors = nearby_meta_data$id,
date_min = "1910-01-01",
date_max = "2010-01-01",
keep_flags = TRUE,
var = "PRCP")
ggplot(data = prcp_var %>%
filter(qflag_prcp == " ")) +
geom_point(aes(x= date, y = prcp, group = id)) +
xlim(limits = c("2009-01-01", "2011-01-01"))
## Warning: Removed 1819 rows containing missing values (geom_point).
Estimate Correlation
Add Reconstruction Column Data
Get Maxima (raw and reconstructed)
Run Viney Bates on raw data
Run King Test on raw data
Run Chisq Test on raw data
Update Quality Flags
Review Outlier Flags
Save Maxima Data